Efficient Construction of Comprehensible Hierarchical Clusterings
Authors
Abstract
Clustering is an important data mining task that helps in finding useful patterns to summarize the data. In the KDD context, data mining is often used for description rather than for prediction. However, it is difficult to find clustering systems, in either the statistics or the Machine Learning field, that ease the interpretation task for the user. In this paper we present Isaac, a hierarchical clustering system that combines traditional clustering ideas with a feature selection mechanism and heuristics in order to provide comprehensible results. At the same time, it deals efficiently with large datasets by means of a preprocessing step. Results suggest that these aims are achieved and encourage further research.
Similar resources
Integrating Declarative Knowledge in Hierarchical Clustering Tasks
The capability of making use of existing prior knowledge is an important challenge for Knowledge Discovery tasks. As an unsupervised learning task, clustering appears to be one of the tasks that might benefit most from prior knowledge. In this paper, we propose a method for providing declarative prior knowledge to a hierarchical clustering system, stressing the interactive component. Pre...
Learning Multiple Hierarchical Relational Clusterings
Three important generalizations of the basic clustering problem are relational, hierarchical, and multiple clustering. This paper proposes the first approach to clustering that unifies all three. We describe a general probabilistic model for relational clustering, and show that flat, hierarchical and multiple relational clustering models are special cases. This paper also describes an efficient...
Optimization and Simplification of Hierarchical Clusterings
Clustering is often used to discover structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. In general, a search strategy cannot both (1) consistently construct clusterings of high quality and (2) be computationally inexpensive. However, we can partition the search so that a sys...
Exploring Efficient Attribute Prediction in Hierarchical Clustering
This work explores the feasibility of constructing hierarchical clusterings minimizing the expected cost of exploiting these clusterings for a prediction task. Particularly, we focus on gaining efficiency by means of reducing the number of features used to describe each node in the hierarchy. To explore a number of different hierarchical clusterings we use the Isaac clustering system, which can se...
MultiDendrograms: Variable-Group Agglomerative Hierarchical Clusterings
MultiDendrograms is an application written in Java that computes agglomerative hierarchical clusterings of data. Starting from a distance (or weight) matrix, MultiDendrograms is able to calculate its dendrograms using the most common agglomerative hierarchical clustering methods. The application implements a variable-group algorithm that solves the non-uniqueness problem found in the standard pai...